Risk factors and clustering of mortality among older adults in the India Human Development Survey

Given the wide socioeconomic differentials in mortality among older adults in India, the question of whether deaths cluster within high-risk families and communities arises repeatedly. The present study uses a follow-up survey from India to investigate the socioeconomic, demographic and health predictors of old-age mortality clustering. We used data on 16,964 older adults nested within 12,981 households from 2352 communities, who were interviewed in the India Human Development Survey (IHDS) round-I (2005) and tracked in round-II (2012). Bivariate associations between the determinants and old-age mortality were investigated using the log-rank test. The multivariate analysis involved estimating a random-intercept Weibull proportional hazard model with three levels: individual (level 1), family (level 2) and community (level 3). We analyzed the sensitivity of the multivariate results to unobserved variable and selection biases using the e-value method. The empirical analysis confirms that the risk of mortality is significantly heterogeneous between families. The health status of older adults and the family's socioeconomic status in the early years emerged as prominent predictors of a longer lifespan. Given the strong association between household income and mortality hazard, the present study urges early-life interventions, as those started in late life might have a negligible impact on keeping older adults alive and healthy.

Though marriage had a significant protective impact on the lives of individuals, males were less likely to be alive in follow-up surveys 10 . A study from Bangladesh provided evidence that being head of the household and residing with a spouse or son helped reduce mortality among older adults 11 . Studies from Ethiopia, New Zealand, Israel and the United States found that living in a rural area, having different ethnic groups and continents of origin, and experiencing financial hardship or stress can easily trigger mortality at older ages [12][13][14][15] . Such differences in mortality risks across socioeconomic, cultural and environmental conditions suggest unequal distribution of mortality risks among older adults. These deaths may be clustered among certain families and communities, putting them under the higher-risk categories.
In India, the population aged 60 years and above is projected to rise to 13.2% in 2031 16 . Inadequate attention to older adults can lead to greater losses in the future in the form of old-age mortality or repercussions such as catastrophic health spending, social and financial insecurity, and physical, social and emotional distress 17,18 . A few existing studies from India have shown the effect of age, gender, caste and living standard on old-age mortality 19,20 . Studies have also shown that prior co-morbidities among older adults worsen old-age mortality risks. Thus, maintaining a healthy lifestyle that involves eating a balanced diet, physical activity, and avoiding substance abuse contributes to fewer diseases, further reducing the mortality risk at older ages 6 . Despite knowledge of such determinants, the quality and length of life of older adults in India vary across families and communities. This creates the need to understand the risk factors behind the unequal distribution of mortality risks among older adults by considering heterogeneity at the household and community levels. The present study addresses the limitations of extant studies and aims to examine the risk factors of old-age mortality in India using a multilevel survival approach based on a nationally representative survey. It also uses follow-up survey data to identify the predictors of old-age mortality and contributes robust evidence to the recent literature in this area.

Methods
Data. This research article utilized the India Human Development Survey (IHDS) wave-I and wave-II, jointly administered by the National Council of Applied Economic Research (NCAER) and the University of Maryland. IHDS is a nationally-representative, multi-topic, large-scale survey that provides essential information on health and morbidity, education, employment and economic status, fertility and marital relations, and social capital of the Indian population. IHDS wave-I and wave-II were conducted during 2005 and 2012, respectively, across all India's states and union territories except for Andaman & Nicobar Islands and Lakshadweep. Both waves of IHDS adopted a multistage stratified random sampling design, and further details on sampling design, data collection and informed consent are available elsewhere 21,22 . Notably, IHDS wave-II was a panel survey, which re-interviewed 83% of the original IHDS wave-I households. Further details regarding the IHDS wave-II panel component are available in the user guide 22 .
This study refers to persons aged 60 years and above as older adults. To examine old-age mortality, this study utilized the tracking sheet data of IHDS wave-II, which covers the period from 2005 to 2012. Further, to explore the determinants of mortality in older adults, we merged the individual-, household- and community-level information from wave-I with the tracking sheet information in wave-II. The analytical sample of this study is 16,964 older adults residing in 12,981 families and nested within 2,352 communities in India.
Mortality statement. The information regarding the mortality status of older adults was obtained from the IHDS wave-II tracking sheet data. With the aim of re-interviewing wave-I households during wave-II, the IHDS collected data on the status of all wave-I respondents during wave-II (this information comprised the tracking sheet data). Notably, during wave-II, IHDS gathered information of the survival status of respondents and the year of death prior to wave-II if respondents were not alive. Therefore, this information on survival status and survival time was used to analyze the mortality of older adults in India. All older adults who died during this period were coded as "Yes"; otherwise, they were coded as "No".

Statistical methods. At the outset, we examined the sample distribution of older adults. Next, we estimated the incidence rate of old-age mortality between 2005 and 2012 and grouped it by gender and age group. Further, we performed bivariate and multivariable analyses to achieve the study objectives. Note that the mortality data described in "Mortality statement" contain censored observations (those older adults who did not experience mortality between wave-I and wave-II and older adults who were lost to follow-up). Therefore, in the bivariate analysis, we calculated the mean survival duration of older adults across the categories of risk factors by accounting for censoring in the data 23 . Further, log-rank tests were performed to examine the association between the risk factors and older adults' mortality status by adjusting for censored cases. Statistical details of the log-rank test are available elsewhere 23 .
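The incidence rate mentioned above (deaths per 1000 person-years lived) can be illustrated with a minimal sketch. The function name and the example records are hypothetical and are not drawn from IHDS; the sketch assumes each person contributes their observed years, whether they died or were censored.

```python
def incidence_rate_per_1000(records):
    """Deaths per 1000 person-years.

    records: iterable of (years_observed, died) pairs, where died is True
    for a death and False for a censored observation.
    """
    deaths = 0
    person_years = 0.0
    for years_observed, died in records:
        # Censored and deceased persons both contribute observed time.
        person_years += years_observed
        if died:
            deaths += 1
    if person_years <= 0:
        raise ValueError("no person-time observed")
    return 1000.0 * deaths / person_years


# Hypothetical records (not IHDS data): two deaths over 22 person-years.
example = [(7.0, False), (3.0, True), (5.0, True), (7.0, False)]
rate = incidence_rate_per_1000(example)
```

Computing the rate separately for each gender or age group only requires filtering the records before calling the function.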
The multivariable analysis involved estimating random-intercept parametric survival regression models. Survival regression models help utilize the information from censored records in the retrospective life-course data, thereby curtailing the loss of crucial information 23 . Notably, parametric survival regression models have the advantage of more efficiently utilizing the information from censored cases compared to semi-parametric regression models 23 . In the survival models, our event of interest is the binary survival status of the older adults between IHDS 2005 and 2012.
Additionally, parametric survival regression models allow us to choose the underlying statistical distribution of time-to-old-age mortality 23 . Based on theoretical knowledge and statistical evidence, we use the Weibull proportional hazard model in our study. The Weibull regression model is appropriate when the hazard of the failure event (here, risk of mortality) is either monotonically increasing or decreasing 23 . Based on existing knowledge of human mortality, we know that the risk of mortality rises steadily among older adults with progressing age 24 . A similar trend is observed in our data (see Fig. 2) of Indian older adults. Therefore, using the Weibull regression hazard model to estimate mortality risk among Indian older adults is theoretically justified 25 . The statistical fit of the models was examined by comparing the Akaike information criterion (AIC) and Bayesian information criterion (BIC) scores of the five prominent random-intercept survival regression models (Exponential, Weibull, Lognormal, Loglogistic and Gamma). We aim to use the model with the lowest AIC and BIC scores, as that would best fit the data.
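The model-selection rule described above can be sketched as follows. The log-likelihoods and parameter counts in the example are made-up placeholders, not results from this study, and taking the number of observations as the BIC sample size is an assumption (some survival software uses the number of events instead).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def best_by(criterion, fits, n_obs):
    """Return the model name with the lowest AIC or BIC.

    fits: dict mapping model name -> (log_likelihood, n_params).
    """
    scores = {}
    for name, (loglik, k) in fits.items():
        if criterion == "aic":
            scores[name] = aic(loglik, k)
        else:
            scores[name] = bic(loglik, k, n_obs)
    return min(scores, key=scores.get)


# Placeholder values for illustration only (not the paper's estimates).
hypothetical_fits = {
    "exponential": (-1050.0, 10),
    "weibull": (-1000.0, 11),
    "lognormal": (-1020.0, 11),
}
chosen = best_by("aic", hypothetical_fits, n_obs=500)
```

Both criteria penalize extra parameters, so a richer distribution is preferred only when it improves the log-likelihood enough to offset the penalty.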
In the random-intercept Weibull hazard model, we included individual (level 1), family (level 2) and community (level 3) as the three levels. The 16,964 older adults from 12,981 families were nested within 2352 communities, forming a hierarchical structure in our study sample. In India, older adults from the same families of the same communities are likely to share the same socioeconomic characteristics and household environment, which means the mortality risk might also be shared. Estimating mortality hazard using standard survival regression would overestimate the risk in this scenario, and using a multilevel framework becomes necessary 26,27 . The three-level random-intercept survival regression model is described by the following terms. Here, $s_k$ is the level 3 residual (group effect at community-level), $c_{jk}$ is the level 2 residual (group effect at family-level) and $e_{ijk}$ is the level 1 residual (individual level). $h(t_{ijk})$ and $h_0(t_{ijk})$ are the overall and baseline hazards of old-age mortality for the ith person belonging to the jth family of the kth community. $\beta_1$, $\beta_2$ and $\beta_3$ give the hazard coefficients of old-age mortality for the person-level, family-level and community-level independent variables, respectively, holding the effect of all other independent variables and the group-level effects constant.
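One form consistent with the terms defined above can be sketched as follows. The covariate symbols $X_{ijk}$, $X_{jk}$ and $X_k$ (person-, family- and community-level covariates) and the Weibull baseline parameters $\lambda$ and $p$ are assumed notation introduced here for illustration; the exact expression used by the authors may differ.

```latex
h(t_{ijk}) = h_0(t_{ijk}) \,
  \exp\!\left( \beta_1 X_{ijk} + \beta_2 X_{jk} + \beta_3 X_{k}
               + s_k + c_{jk} + e_{ijk} \right),
\qquad
h_0(t) = \lambda \, p \, t^{\,p-1}
```

Under this parameterization the baseline hazard increases monotonically with time when $p > 1$ and decreases when $p < 1$, which is the property invoked to justify the Weibull choice.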
The random-intercept regression models provide the Intraclass Correlation Coefficient (ICC) and the Median Hazard Ratio (MHR), which measure the mortality clustering of older adults within families and communities. The family-level ICC measures the correlation in mortality risk among older adults belonging to the same family of the same community 27,28 . It is calculated 29 from $\sigma_i^2$, $\sigma_f^2$ and $\sigma_c^2$, the individual-, family- and community-level random-effect variances. Similarly, the community-level ICC denotes the correlation in mortality risk among older adults of the same community 27,29 . It is calculated from the same variances, where the notations have the usual meaning. The ICC value lies between 0 and 1. The higher the value of the ICC, the greater is the extent of mortality clustering at the respective level.
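For reference, the usual variance-partition expressions for a three-level random-intercept model, written with the variances named above, are sketched below. This is the standard textbook form and is offered as an assumption about what the cited references define. The value taken for $\sigma_i^2$ is not stated in the text; $\pi^2/6$ is a common choice under the latent-variable formulation of the Weibull model.

```latex
\mathrm{ICC}_{\text{family}}
  = \frac{\sigma_f^2 + \sigma_c^2}{\sigma_i^2 + \sigma_f^2 + \sigma_c^2},
\qquad
\mathrm{ICC}_{\text{community}}
  = \frac{\sigma_c^2}{\sigma_i^2 + \sigma_f^2 + \sigma_c^2}
```

The family-level numerator includes $\sigma_c^2$ because two older adults in the same family necessarily share the same community as well.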
Similarly, the family-level (or community-level) MHR gives the median relative change in the hazard of old-age mortality among all possible pairs of identical older adults from two separate randomly selected families (or communities), ordered by mortality risk 30 . The family-level and community-level MHRs are calculated from the corresponding random-effect variances, where the notations have the usual meaning. The value of the MHR is always greater than or equal to one, and the higher the value, the greater is the heterogeneity in old-age mortality risk across clusters. Further statistical details regarding the ICC and MHR are available from the cited references.
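A minimal sketch of the MHR calculation is given below, using the commonly cited expression MHR = exp(sqrt(2 σ²) × Φ⁻¹(0.75)), where σ² is the random-intercept variance at the level of interest and Φ⁻¹ is the standard normal quantile function. That the authors used exactly this expression is an assumption, and the variance in the example is a placeholder rather than an estimate from this study.

```python
import math
from statistics import NormalDist


def median_hazard_ratio(variance):
    """Median hazard ratio for a random-intercept variance.

    Uses exp(sqrt(2 * variance) * z_0.75), where z_0.75 is the 75th
    percentile of the standard normal distribution (about 0.6745).
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * variance) * z75)


# Placeholder variance for illustration only.
mhr_example = median_hazard_ratio(1.0)
```

A variance of zero gives an MHR of exactly one, which corresponds to no heterogeneity across clusters.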
Further, the multivariable association between the independent variables and old-age mortality risk was shown using hazard ratios (HR). The HR gives the hazard of old-age mortality compared to the baseline mortality risk among older adults belonging to a particular category of an explanatory variable when the effect of other explanatory variables and the community- and family-level variability remain constant 23 .
Moreover, sensitivity analysis was performed by inspecting the presence of unobservable variable bias in the adjusted hazard ratios using the e-value method 31,32 . The e-value method gives the e-value statistic, which is defined as the minimum strength of association (on the hazard ratio scale) that an unmeasured confounder would need to have with both the treatment and the outcome variables after adjusting for the effect of other independent variables, such that the treatment-outcome variable association is nullified 31 . Therefore, the higher the e-value, the more robust is the corresponding hazard ratio to unobserved variable bias. The statistical significance of the e-value was determined from the CI limit (nearest limit to the null value of 1.00) 32 . The CI limit was 1.00 if the e-value was not statistically significant at the 5% level 32 .
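The e-value calculation described above can be sketched as follows, using the published closed form E = RR + sqrt(RR × (RR − 1)) for a ratio above one (ratios below one are inverted first). Treating the hazard ratio directly as a risk ratio is appropriate only when the outcome is rare; for a common outcome, an approximate conversion is included. Which variant the authors applied is not stated, so the function names and the choice between them are assumptions.

```python
import math


def e_value(ratio):
    """E-value for a risk ratio (or a hazard ratio with a rare outcome)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio < 1:
        ratio = 1.0 / ratio  # work on the side of the null above 1
    return ratio + math.sqrt(ratio * (ratio - 1.0))


def hr_to_rr_common_outcome(hr):
    """Approximate risk ratio from a hazard ratio when the outcome is common."""
    return (1.0 - 0.5 ** math.sqrt(hr)) / (1.0 - 0.5 ** math.sqrt(1.0 / hr))


def e_value_for_ci(estimate, lower, upper):
    """E-value for the confidence limit closest to the null value of 1."""
    if lower <= 1.0 <= upper:
        return 1.0  # interval includes the null
    limit = lower if estimate > 1.0 else upper
    return e_value(limit)


# Illustrative inputs only (not estimates from this study).
point = e_value(2.0)
bound = e_value_for_ci(2.0, 1.5, 3.0)
```

An e-value of 1 for the confidence limit means that no unmeasured confounding is needed to move the interval to include the null.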
We checked and found that none of the multivariable models violated the multicollinearity assumption 33 . Unfortunately, IHDS does not provide sampling weights in the tracking sheet data, so the study results are unweighted. Statistical significance was determined at the 5% level unless mentioned otherwise. Statistical estimations were performed using Stata 14 software 34 .

Explanatory variables.
Existing studies have shown several factors which explain the mortality among older adults 7,9,19,20 . We included these variables, conditional on their availability in the IHDS dataset. All the below-mentioned characteristics were measured for the older adults during wave-I. The individual-level variables related to the older adults include:

We constructed three community contextual characteristics (education level, poverty status and social standard) by aggregating the information on the education level of individuals, the below poverty line (BPL) status of the household and the caste of the household to the community level, respectively. Prior to aggregation, we constructed binary variables for each of the three characteristics. Community education level was defined as the proportion of individuals with more than 10 years of schooling among all individuals in the community. The higher the proportion of educated individuals, the greater the community's education standard. The community poverty status was defined as the proportion of BPL households among all households in the community. A higher proportion of BPL households means a greater prevalence of poverty in the community. Further, community social standard was constructed as the proportion of Non-SC/ST households among all households in the community. Therefore, the higher the proportion of Non-SC/ST households, the greater is the community's social standard. For ease of interpretation, we categorized the proportions into three categories: "low" (bottom third), "medium" (middle third), and "high" (top third).
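The aggregation and categorization steps described above can be sketched as follows. The record layout, function names and example data are hypothetical; in particular, splitting communities into thirds by rank (with ties broken by sort order) is an assumption about how the cut-points were applied.

```python
from collections import defaultdict


def community_proportions(records):
    """Proportion of units with a binary flag set, per community.

    records: iterable of (community_id, flag) pairs, e.g. one pair per
    household with flag=True if the household is below the poverty line.
    """
    counts = defaultdict(lambda: [0, 0])  # community -> [flagged, total]
    for community_id, flag in records:
        counts[community_id][0] += 1 if flag else 0
        counts[community_id][1] += 1
    return {cid: flagged / total for cid, (flagged, total) in counts.items()}


def tercile_labels(proportions):
    """Label each community low, medium or high by rank of its proportion."""
    ordered = sorted(proportions, key=proportions.get)
    n = len(ordered)
    labels = {}
    for rank, cid in enumerate(ordered):
        if rank < n / 3:
            labels[cid] = "low"
        elif rank < 2 * n / 3:
            labels[cid] = "medium"
        else:
            labels[cid] = "high"
    return labels


# Hypothetical household records: (community, is_below_poverty_line).
households = [("A", False), ("A", False), ("B", True), ("B", False),
              ("C", True), ("C", True)]
props = community_proportions(households)
labels = tercile_labels(props)
```

The same two functions apply to the education and social-standard measures once the corresponding binary flag has been constructed for each individual or household.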
Additionally, we included the following community-level characteristics: (d) Type of community (urban, rural). (e) Geographical region (southern, western, eastern, central, north eastern, northern). The geographical regions divided India's erstwhile 33 states and union territories into six areas based on administrative divisions 39 .
Ethics approval and consent to participate. The present study utilized a publicly available secondary dataset with no information that would lead to the identification of the respondents. IHDS obtained the informed consent of respondents before the data collection. Therefore, no ethical approval was necessary for using these datasets. All survey methods were performed following the relevant guidelines and regulations.

Results
Sample description. Table 1 shows the characteristics of 16,964 older adults aged 60 years and above during IHDS 2005. Nearly 61% of older adults were aged between 60 and 69 years, and 50% were male. Nearly 6% and 4% of older adults had hypertension and diabetes, respectively. Moreover, one in ten older adults faced difficulty performing activities of daily living, one-fifth of older adults smoked tobacco, and 7% consumed alcohol. Further, six in ten older adults had no formal schooling, and 36% were widowed. While one-tenth of older adults lived in single generation households, 32% belonged to the lowest 40% wealth quintile households. Coming to the community context, we observed that 70% of older adults resided in rural areas, three in ten older adults belonged to communities with a high level of education and social standard. Further, 35% and 33% of children were from communities with low socioeconomic status and had a low maternal education level, respectively. In terms of population distribution, most older adults (33%) were from the Northern region, followed by the Southern (24%) region.

Figure 1 shows the Mortality Incidence Rate (per 1000 person-years lived (PYL)) among subgroups of older adults for 2005-2012. The overall old-age mortality rate was 39 per 1000 PYL. The mortality rate was higher in male older adults (42 deaths per 1000 PYL) and those aged 80 years and beyond (98 deaths per 1000 PYL) compared to their counterparts from other sub-groups.
Bivariate analysis. Table 2 shows the average survival duration and the bivariate association of old-age mortality with the individual-, family- and community-level determinants. Most of the individual and household level factors in 2005 were associated with old-age mortality between 2005 and 2012. The community's education level, poverty status, and social standard were significantly associated with old-age mortality. Moreover, the mortality hazard was also significantly associated with the type and geographical region of the community.
Model specification. Table 3 shows the goodness-of-fit statistics for the Exponential, Weibull, Lognormal, Loglogistic and Gamma random-intercept survival regression models for old-age mortality. The Weibull regression models are the best fit as they have the lowest AIC and BIC scores among all the models. Further, Fig. 2 shows that the hazard of old-age mortality increases with the duration of observation. Therefore, the choice of the Weibull model is conceptually and statistically justified.

Table 4 shows the family- and community-level effects from the random-intercept Weibull hazard models of old-age mortality. We estimated two regression models: the null model is an empty model without any covariates, and the full model includes all covariates (see "Statistical methods"). In both models, the variation in mortality risk at both the family and community levels was statistically significant. However, the family-level variation was at least twenty times higher than the community-level variation in both models. The family-level ICC for the full model shows a 61% correlation in the risk of mortality among older adults belonging to the same family of the same community (after adjusting for the individual-level, family-level and community-level characteristics). Moreover, the median hazard of mortality is 2.12 times higher (family-level MHR) between all pairs of high-risk and low-risk families. Additionally, the statistically significant Weibull regression parameter implies that the assumption of monotonically increasing mortality hazard with time is not violated.

Contrary to the bivariate analysis, we find that the educational level, poverty status and social standard of the community were not associated with mortality risk among older adults after adjusting for the effect of other independent variables and the community-level and family-level effects.
However, older adults residing in communities from Northern […] did not suffer from unobserved variable bias. Upon observing family-level characteristics, it was evident that the association of family, household wealth quintile and poverty status with old-age mortality was not sensitive to omitted variable bias. After estimating the Weibull survival regression models, we obtained the adjusted cumulative hazard curves of old-age mortality grouped by lifestyle, social and economic characteristics (Fig. 3). Notably, the cumulative hazard curves for smoking tobacco, drinking alcohol, level of education, family structure, the caste of the household head and household wealth quintile were adjusted for the effect of other independent variables and the family- and community-level effects. We find that the results in the graphs point in the same direction as those obtained from the multivariable regression model.

Discussion
Today, with substantial health advancements worldwide, people can expect to live into their sixties and beyond. Longer life has provided opportunities for older people (such as pursuing their passions, education or a new career) and opened up the chance to contribute to their families and communities. However, the growing mortality and health risks during old age hinder such opportunities and contributions 40 . The present study reveals a significant loss in the old-age population between 2005 and 2012, with an unequal distribution of mortality risks across families and communities.
Using a follow-up survey from India, the present study shows that many high-risk families in India lose multiple members aged 60 years and above (mortality clustering in families). Although older adults share common characteristics within communities, the present study does not find any significant clustering of mortality at the community level. Even after adjusting for unobserved heterogeneity at the family and community levels, mortality risk was higher among older men than among older women. Consistent with the previous literature, older adults with poor education and those who were unemployed experience higher mortality risk 13,41 . The long-term consequence of widowhood was prominent in the study, as being widowed brings higher mortality risks among older adults. The unfortunate situation of widowed older adults is also visible in extant Indian literature 42 . A possible explanation for this association is the protective effect of marriage through social, psychological, economic and environmental support 43 . Household headship provides constant involvement in and control over the household's social and financial affairs, along with a sense of security and authority. This might be the reason that household headship among older adults reduces long-term mortality risk in the present study.
Ample evidence reveals an essential role of the social participation of older adults in long-term survival, as it may protect them from loneliness, depression, stress, or the sadness of being away from loved ones 44 . However, in contrast to past evidence, the present study found an insignificant association between social participation in the first wave and mortality risk until the follow-up period. This might be due to the long window of observation: older adults who were actively engaged in social activities might not have continued because of poor health, leaving them in distress, which can lead to a shorter lifespan. The health status of older adults is the most prominent predictor among all the individual factors 10,45 . For instance, if an individual had poor health status in the first wave (i.e., chronic diseases or difficulty doing daily activities), then better education, working status, marital status, or social participation will not be of much help in reducing the long-term mortality risk unless they take early preventive measures.

Extended or joint families experience higher mortality risks among older adults. This association is possible because joint or extended families have more older adults than single-generation or nuclear families, making them vulnerable to mortality risks. Moreover, having no children can also be responsible for higher old-age mortality risks due to financial insecurity and loneliness 46 . Past evidence from India shows a socioeconomic disparity in older-age mortality, which is also evident in this study; however, those studies were unable to show the long-term impact 20 . Consistent with a longitudinal study from Taiwan, the present study found that the most prosperous older adults and those living above the poverty line in the first wave enjoy a longer lifespan in the future 10 .
Despite a higher proportion of older adults in southern regions of India, mortality risks were higher in central, northern and eastern areas of India. This might be possible due to individuals' better health care-seeking behavior in southern regions of India 47 . However, despite having a poor health care system in the north-eastern regions, the mortality risk remains lower in the present study. Surprisingly, this may be due to the family level factors acting as a protective shield for the older adults or a higher female population in older ages 48,49 . For instance, the solid biological advantage of females and satisfaction of being closer to family and community might help in lowering the mortality risk of north-eastern older adults.
Despite providing robust evidence of heterogeneity in older-age mortality risk at the family level and revealing the long-term effect of individual, household and community factors on older-age mortality risks, the present study has its limitations. Ample evidence shows that depression and life satisfaction are emerging as prominent indicators of a longer lifespan; however, we could not capture their effect due to the unavailability of information in the data used for this study 50 . Self-reported chronic conditions may introduce biases, such as inaccurate responses, so biological or clinical markers of chronic diseases should be considered when studying the mortality dynamics of individuals 51 . The present study uses self-reported information on chronic diseases because biological measures for older adults were unavailable. Additionally, the study results are unweighted and need to be interpreted accordingly.

Table 3. Measures of goodness-of-fit for three-level random intercept survival regression models of mortality among older adults in India. AIC, Akaike information criterion; BIC, Bayesian information criterion.

Conclusion
In India, families are the prime source of caregivers for older adults. With significantly higher mortality risk heterogeneity across Indian families, the present study confirms that family-level factors (i.e., having children, income level, poverty status and ethnicity) in the early years may have a noticeable impact on the lifespan of older adults. Along with the individual-level factors (i.e., education, employment, support of a partner, social participation, and health behavior), health status in the form of chronic diseases and difficulty with activities of daily living continues to have a significant impact on the survival of older adults.
Past literature from developed countries shows no health gradient between rich and poor before the advent of modern science and advanced technologies 2 . However, with the growing development of treatments and drugs, wealthier populations can readily pay to cure diseases and ensure a longer life. This trend continues in developing countries too and, combined with poorer knowledge of health behavior among the uneducated population, increases the disparities across socioeconomic strata. Such past evidence, together with the detrimental effect of poverty and lower income found in the present study, confirms the unequal distribution of old-age mortality across families. The present study will help policymakers understand the development of such a mortality gradient in the old-age population of India and provides evidence to guide policy interventions for high-risk families. The long-term consequences of socioeconomic status and health conditions for old-age mortality risk further call for early-life interventions, as those started in late life might have a negligible impact on keeping older adults alive and healthy.
Traditionally, joint or extended families were a characteristic feature of Indian life, in which older adults enjoyed authority along with care from younger generations. However, changes in living arrangements and lifestyle in recent years have brought a shift in the caregiver role within families. The emergence of health concerns such as low life satisfaction, stress, and depression among both wealthy and low-income families further calls for future research on old-age mortality risks in India.

Table 4. Random-effect parameters from three-level Weibull random intercept survival regression models of mortality among older adults in India. (a) CI, confidence interval; (b) Null model, Model without any explanatory covariates; Full model, Model with all explanatory covariates; (c) Likelihood ratio tests were performed against single-level Weibull survival regression models with the same covariates respectively.

Measures | Null model | Full model